Abstract

Using results from quantum filtering theory and methods from classical control theory, we 
derive an optimal control strategy for an open two-level system (a qubit in interaction with 
the electromagnetic field) controlled by a laser. The aim is to optimally choose the laser's 
amplitude and phase in order to drive the system into a desired state. The Bellman equations 
are obtained for the case of diffusive and counting measurements for vacuum field states. A 
full exact solution of the optimal control problem is given for a system with simpler, linear, 
dynamics. These linear dynamics can be obtained physically by considering a two-level atom 
in a strongly driven, heavily damped, optical cavity. 



1 Introduction 



The advent of quantum information theory and the ever increasing experimental possibilities to
implement this theory on real physical systems, e.g. [2], have created great demand for a theory
of the control of quantum systems. Since qubits, i.e. two-level quantum systems, make up the
hardware for quantum information processing, one important question is how to optimally control
or engineer their states. Many problems of quantum computation can be formulated in terms of
quantum optimal control of unitary or decohering gates. Most previous work on the optimal control
of qubit states uses an open loop strategy with a variational calculus approach to optimization [18],
[22], [19]. However, in order to apply controls one must consider the qubit as an open quantum
system, which opens the possibility of time-continuous non demolition measurements, so that a
closed loop (feedback) strategy becomes more advantageous. In this paper, we employ a feedback
strategy based on dynamic programming, which yields a globally optimal solution to the control
problem and thus extends the previous locally optimal variational approaches.



The importance of feedback control theory in the control of open quantum systems was first
recognized by Belavkin. As in the classical case of partially observed systems, a feedback
control strategy is usually preferable to open loop control (without feedback). Optimal feedback
control strategies for the open quantum oscillator appeared even earlier in [3], and a quantum
Bellman equation for optimal feedback control was introduced in [6] for a general diffusive and a
counting measurement process. An interest in optimal quantum control and stability theory has
recently emerged in the optics community.

As was shown in the above papers, since we never have complete observability of quantum
systems, the problem of quantum feedback control must involve a filtering procedure in order to
measure and control the system optimally. We can separate these two problems, as was suggested
in [5], and consider first the problem of quantum filtering [7], [5]. In quantum filtering
theory, pioneered by Belavkin, the quantum filtering equation for the system with a
chosen continuous non demolition measurement has to be derived. The state of a system that is
observed through its interaction with the electromagnetic field, by continuous measurement of
some field observables, needs to be updated continuously in time to incorporate the information
gained by the measurement. That is, we have to condition the quantum state of the system on the
obtained measurement results continuously in time. The quantum filtering equation
is a stochastic differential equation for the conditioned state in which the innovation process,
representing the information gain, is one of the driving terms. As in the quantum optics literature,
we take the filtering equation as our starting point; however, the driving Wiener process is not
treated as noise, but as an innovation process. For more background on the derivation of this
stochastic equation as a general filtering equation for an open quantum system conditioned with
respect to a non demolition observation, see [10].

Once the quantum filtering equation is obtained, we are left with a classical control problem. In 
particular, if the state of a qubit is parameterized by its polarization vector in the Bloch sphere, 
i.e. a vector in the 3-dimensional unit ball providing sufficient coordinates for the system [5], the 
filtering equation provides stochastic dynamics for the polarization vector. The control is present 
in the dynamics through Rabi oscillations, which perform rotations of the polarization vector in 
the Bloch sphere caused by a laser driving the qubit. The phase and intensity of the laser are the 
parameters that can be controlled. 
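
As an illustration of this parameterization, the following minimal Python sketch converts between a polarization vector and a density matrix. It assumes the standard Bloch convention $\rho = \tfrac{1}{2}(1 + P_x\sigma_x + P_y\sigma_y + P_z\sigma_z)$ with $\sigma_z = \mathrm{diag}(1,-1)$; the function names are hypothetical and chosen only for this example.

```python
def bloch_to_density(p):
    """Return the 2x2 density matrix (nested lists) for p = (px, py, pz).

    Assumes rho = (1 + px*sigma_x + py*sigma_y + pz*sigma_z) / 2.
    """
    px, py, pz = p
    return [
        [0.5 * (1 + pz), 0.5 * complex(px, -py)],
        [0.5 * complex(px, py), 0.5 * (1 - pz)],
    ]


def density_to_bloch(rho):
    """Recover (px, py, pz) as the expectation values Tr(rho * sigma_k)."""
    lower = complex(rho[1][0])  # rho_10 = (px + i*py) / 2
    px = 2 * lower.real
    py = 2 * lower.imag
    pz = (complex(rho[0][0]) - complex(rho[1][1])).real
    return (px, py, pz)
```

In this convention the $\sigma_z$-up state corresponds to the vector $(0, 0, 1)$ and pure states lie on the surface of the unit ball.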

The main aim of this paper is to demonstrate the relevance of classical control and quantum filtering 
when controlling quantum systems. This is shown by the example of optimal control of a two-level 
quantum system. A cost function, which is a measure of optimality of the control, is introduced 
and the corresponding Bellman equations are derived for this system. From these equations, we 
produce an optimal control strategy which depends on the solutions to the corresponding Hamilton- 
Jacobi-Bellman equation. In general these solutions are very difficult to find, even numerically, 
so we resort to a physically motivated simplification of the dynamics by considering a qubit in
a strongly driven, heavily damped, optical cavity. This enables us to present an exact solution to
the control problem. 

The remainder of the paper is organized into four main sections. Firstly we describe the model 
and introduce the dynamics of the polarization vector from the filtering equation for diffusive and 
counting measurement for an initial vacuum field state. The next section describes the principle 
of optimality which is the key idea behind optimal feedback control and enables us to derive the 
Bellman equations in Section 4. We finish the paper with the simpler model corresponding to 
a two-level system in a strongly driven, heavily damped, optical cavity. Here we obtain a linear 
filtering equation, which we use with a quadratic cost function to give an exact solution for the 
optimal feedback control strategy. 

2 The model and state dynamics



We consider a two-level system, i.e. a qubit, in interaction with the quantized electromagnetic
field in the weak coupling limit. This means that the unitary dynamics of the qubit and
the field together in the interaction picture is given by a quantum stochastic differential equation
(QSDE). In this way the field acts as non-commutative noise on the qubit. The initial state of
the noise (electromagnetic field) is taken to be the vacuum state and the reduced dynamics of the
qubit is given by a master equation. Such a quantum Langevin model was the starting point of
the quantum stochastic theory of continuous non demolition measurements.

We control the state of the qubit by its interaction with a laser beam. This laser beam is given by
a channel in the field, called the forward channel, which is in a coherent state $\psi(u)$, where $u$ is a
square integrable complex valued function of time. The control function $u$ induces Rabi oscillations,
which we must choose carefully to rotate the state of the qubit in the desired manner. The rest of
the field is called the side channel. We assume that there is no direct scattering between the two
channels. Following [7], we consider two different continuous time measurement schemes
to be performed in the side channel. The first is a homodyne
detection experiment, which measures the field quadrature $Y_t = A_s^*(t) + A_s(t)$, a classical
diffusive observable process at the output of the quantum system. The second is a counting
experiment, counting the number $Y_t = N_t$ of fluorescence photons emitted by the qubit up to time
$t$.



Since the side channel and atom are in interaction, we gain information on how the state of
the qubit changes from the measurement results of the homodyne detection experiment or the
counting experiment. The state of the qubit conditioned on the measurement result $\omega$ of the non
demolition output process $Y_t$ is a random state. This means it is a map $\rho^t_\bullet$ from the possible paths
of measurement results $\Omega$ to the $2 \times 2$ density matrices, mapping $\omega \in \Omega$ to the density matrix
$\rho^t_\omega$, which represents the state of the qubit conditioned on a path of measurement results $\omega$ up to
time $t$. Note that for homodyne detection, a path $\omega$ of measurement results is just the path of the
photocurrent from time $0$ to time $t$. For the counting experiment a path of measurement results
is given by the collection of times at which photons were detected.

The conditional state evolution of the qubit is given by a classical stochastic differential equation
for the density matrix $\rho^t_\bullet$, called the quantum filtering or Belavkin equation. For the
homodyne detection experiment we take the quantum filtering qubit equation derived in [5] with
respect to the diffusive output process $Y_t$ as our starting point:

$$
d\rho^t_\bullet = L(\rho^t_\bullet)\,dt + \big(V_s\rho^t_\bullet + \rho^t_\bullet V_s^* - \mathrm{Tr}(V_s\rho^t_\bullet + \rho^t_\bullet V_s^*)\,\rho^t_\bullet\big)\big(dY_t - \mathrm{Tr}(V_s\rho^t_\bullet + \rho^t_\bullet V_s^*)\,dt\big), \tag{1}
$$

where

$$
V_s := \kappa_s V \quad\text{with}\quad V := \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \tag{2}
$$

and $\kappa_s^2$ is the decay rate into the side channel. Furthermore, the Lindblad term $L$ is given by

$$
L(\rho) = -i[H,\rho] + V_f\rho V_f^* - \tfrac{1}{2}\{V_f^*V_f,\rho\} + V_s\rho V_s^* - \tfrac{1}{2}\{V_s^*V_s,\rho\}, \tag{3}
$$

with time dependent controlling Hamiltonian

$$
H := \begin{pmatrix} 0 & i\kappa_f\,u(t) \\ -i\kappa_f\,\bar{u}(t) & 0 \end{pmatrix},
$$

and with $V_f := \kappa_f V$, where $\kappa_f^2$ is the decay rate into the forward channel. We choose units such
that $\kappa_s^2 + \kappa_f^2 = 1$. The form of the Hamiltonian physically relates to two orthogonal control fields
corresponding to the real and imaginary parts of the complex control function $u(t)$. The innovating
martingale (the last factor in (1)) is just a Wiener process $W_t$ which describes the information gain
from the measurement, i.e. the observed process $Y_t$ satisfies the stochastic differential equation

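$$
dY_t = \mathrm{Tr}(V_s\rho^t_\bullet + \rho^t_\bullet V_s^*)\,dt + dW_t, \tag{4}
$$

and the Belavkin diffusion filtering equation can be written as a stochastic master equation

$$
d\rho^t_\bullet = L(\rho^t_\bullet)\,dt + \big(V_s\rho^t_\bullet + \rho^t_\bullet V_s^* - \mathrm{Tr}(V_s\rho^t_\bullet + \rho^t_\bullet V_s^*)\,\rho^t_\bullet\big)\,dW_t. \tag{5}
$$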

For $a \in \mathbb{R}^3$ we introduce the notation $\sigma(a) := a_1\sigma_x + a_2\sigma_y + a_3\sigma_z$, where $\sigma_x$, $\sigma_y$ and $\sigma_z$ denote the
Pauli spin matrices. The states of a qubit can be parameterized by vectors in the Bloch sphere
$B := \{p \in \mathbb{R}^3;\ \|p\| \le 1\}$. The random vector with which we parameterize the state $\rho^t_\bullet$ is denoted
$P^t_\bullet$ and is called its polarization vector, i.e. we write

$$
\rho^t_\bullet = \tfrac{1}{2}\big(1 + \sigma(P^t_\bullet)\big).
$$

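Introducing $u^+_t := \kappa_f\,\mathrm{Re}(u(t))$ and $u^-_t := \kappa_f\,\mathrm{Im}(u(t))$, we can write the filtering equation (5) as

$$
dP^t = \begin{pmatrix} -\tfrac{1}{2}P^t_x - 2u^+_t P^t_z \\ -\tfrac{1}{2}P^t_y + 2u^-_t P^t_z \\ -(1 + P^t_z) + 2u^+_t P^t_x - 2u^-_t P^t_y \end{pmatrix} dt + \begin{pmatrix} 1 + P^t_z - (P^t_x)^2 \\ -P^t_x P^t_y \\ -P^t_x(1 + P^t_z) \end{pmatrix} \kappa_s\,dW_t. \tag{6}
$$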
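
To make the structure of the diffusive filtering equation (6) concrete, the sketch below evaluates its drift and diffusion vectors and advances the polarization vector by one Euler-Maruyama step. The discretization scheme, the function names and the use of Python's `random.Random` as noise source are assumptions made for illustration only; in a simulation the increment `dw` plays the role of the innovation $dW_t$.

```python
import math
import random


def drift_and_diffusion(p, u_plus, u_minus, kappa_s):
    """Drift and diffusion vectors of equation (6) at polarization p."""
    px, py, pz = p
    drift = (
        -0.5 * px - 2 * u_plus * pz,
        -0.5 * py + 2 * u_minus * pz,
        -(1 + pz) + 2 * u_plus * px - 2 * u_minus * py,
    )
    diffusion = (
        kappa_s * (1 + pz - px * px),
        kappa_s * (-px * py),
        kappa_s * (-px * (1 + pz)),
    )
    return drift, diffusion


def euler_maruyama_step(p, u_plus, u_minus, kappa_s, dt, rng):
    """One Euler-Maruyama step of (6); dw is a sample of the innovation."""
    drift, diffusion = drift_and_diffusion(p, u_plus, u_minus, kappa_s)
    dw = rng.gauss(0.0, math.sqrt(dt))
    return tuple(x + a * dt + b * dw for x, a, b in zip(p, drift, diffusion))
```

Note that the state $(0, 0, -1)$ is a fixed point of the uncontrolled dynamics: both vectors vanish there. A plain Euler step does not keep the vector inside the Bloch ball exactly, so a practical simulation would need small steps or a projection.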

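For the counting experiment, the Belavkin filtering equation reads

$$
d\rho^t_\bullet = L(\rho^t_\bullet)\,dt + \Big(\frac{V_s\rho^t_\bullet V_s^*}{\mathrm{Tr}(V_s\rho^t_\bullet V_s^*)} - \rho^t_\bullet\Big)\big(dN_t - \mathrm{Tr}(V_s\rho^t_\bullet V_s^*)\,dt\big),
$$

where $L$ and $V_s$ are given by (3) and (2), and $N_t$ is the random variable counting the number of
detected photons up to time $t$. In parameterized form this reads

$$
dP^t = \begin{pmatrix} -\tfrac{1}{2}P^t_x - 2u^+_t P^t_z \\ -\tfrac{1}{2}P^t_y + 2u^-_t P^t_z \\ -(1 + P^t_z) + 2u^+_t P^t_x - 2u^-_t P^t_y \end{pmatrix} dt + \begin{pmatrix} -P^t_x \\ -P^t_y \\ -(1 + P^t_z) \end{pmatrix} \Big(dN_t - \frac{\kappa_s^2}{2}(1 + P^t_z)\,dt\Big). \tag{7}
$$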
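
The jump dynamics (7) can be sketched in the same spirit. In the hypothetical routine below a photon is detected during a step of length `dt` with probability $\tfrac{\kappa_s^2}{2}(1 + P^t_z)\,dt$, in which case the qubit jumps to the ground state $(0, 0, -1)$; otherwise the state follows the drift together with the compensating term. The first-order time stepping and all names are assumptions of this example.

```python
import random


def counting_step(p, u_plus, u_minus, kappa_s, dt, rng):
    """One first-order step of equation (7). Returns (new_p, jumped)."""
    px, py, pz = p
    rate = 0.5 * kappa_s ** 2 * (1 + pz)  # photon detection rate
    if rng.random() < rate * dt:
        # dN_t = 1: the state jumps to P + Q = (0, 0, -1).
        return (0.0, 0.0, -1.0), True
    drift = (
        -0.5 * px - 2 * u_plus * pz,
        -0.5 * py + 2 * u_minus * pz,
        -(1 + pz) + 2 * u_plus * px - 2 * u_minus * py,
    )
    q = (-px, -py, -(1 + pz))
    # dN_t = 0: drift plus the compensator -Q * rate * dt.
    new_p = tuple(x + a * dt - qk * rate * dt for x, a, qk in zip(p, drift, q))
    return new_p, False
```

The approximation is only meaningful when `rate * dt` is small; for larger steps the jump probability would have to be computed more carefully.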



3 The principle of optimality 

In order to find an optimal quantum feedback control strategy based on the continuous non demolition
observation we shall use the dynamic programming method for the sufficient qubit coordinate
$P^t$.

At time $t = 0$ the qubit is taken to be in a known initial state $P^0$. It is our objective to bring
it into the $\sigma_z$-up state at time $t = T$, at which the control experiment is stopped. This is done by
choosing the laser intensity and phase, given in terms of $u^+_t$ and $u^-_t$, at every time $t$; these may
depend on the stochastic state $P^t$ of the qubit at time $t$, via a feedback mechanism. The total cost
of the control experiment from time $0$ up to time $T$ is described by

$$
J := (1 - P^T_z) + \int_0^T \big((u^+_s)^2 + (u^-_s)^2\big)\,ds. \tag{8}
$$

The first term reflects our main objective, which is to bring the system into the $\sigma_z$-up state at time
$T$, whereas the second term reflects the cost of using the laser. The second term restricts our
resources. Without this restriction it would be possible to apply brute force, e.g. a very strong
laser pulse at the end of the experiment, to reach our goal.

Note that the total cost $J$ of equation (8) is a random variable. It depends on the stochastic
measurement results through the random variable $P^T_z$ and the applied controls $u^+_t$ and $u^-_t$, which
in their turn depend on the random state $P^t$ of the qubit. From equation (8) it follows that the
expected cost-to-go $J(t, P^t)$ at time $t$, when we are in the state $P^t$ at time $t$, is given by

$$
J(t, P^t) := \mathbb{E}_{P^t}\Big[(1 - P^T_z) + \int_t^T \big((u^+_s)^2 + (u^-_s)^2\big)\,ds\Big], \tag{9}
$$

where $\mathbb{E}_{P^t}$ denotes the expectation over all possible paths of measurement results conditioned on
the fact that we are in state $P^t$ at time $t$. The problem addressed in this paper is how to choose the
feedback controls $u^+_t$ and $u^-_t$ at every time $t$, such that the total expected cost $J(0, P^0)$ $(= \mathbb{E}_{P^0}[J])$
is minimal. The solution to this problem, i.e. a map $\mu^* : [0,T] \times B \to \mathbb{R}^2$ assigning numbers $u^+_t$
and $u^-_t$ to every time $t$ and state $P^t$ that minimize $J(0, P^0)$, is called an optimal strategy. Here the
star in $\mu^*$ denotes optimality of the strategy. Extending this convention we denote the minimal
or optimal cost by $J^*(0, P^0)$.

A key observation in this problem is that if we have a strategy $\mu^*_{[s,T]}$, $0 < s < T$, that is optimal
over the interval $[s,T]$ (i.e. one which minimizes $J(s, P^s)$ for every possible state $P^s$ at time $s$),
then the optimal strategy $\mu^*$ of the whole experiment coincides with $\mu^*_{[s,T]}$ when restricted to
the interval $[s,T]$. So we optimize over disjoint intervals, working backwards in time to build an
optimal strategy for the whole experiment. This idea is called the principle of optimality and
lies at the heart of optimal feedback control theory.

Iteration of the principle of optimality enables a recursive solution to the optimal control problem
known as dynamic programming. To illustrate this method we divide the time interval $[0,T]$
into $N$ parts of equal size $\Delta := T/N$. The principle of optimality leads for $0 \le n < N$ to the
following recursive dynamic programming equation [3], [7]

$$
J^*(n, P^n) = \min_{u^+_n,\,u^-_n} \Big\{ \mathbb{E}_{P^n}\Big[\big((u^+_n)^2 + (u^-_n)^2\big)\Delta + J^*(n+1, P^{n+1})\Big] \Big\}, \tag{10}
$$

with boundary condition $J^*(N, P^N) = 1 - P^N_z$. Using the state dynamics, $P^{n+1}$ can be expressed
in terms of $P^n$, $u^+_n$ and $u^-_n$. The minimization of (10), working backwards from $n = N - 1$ to $n = 0$,
yields the optimal control strategy $(u^+_n, u^-_n) = \mu^*(n, P^n)$.
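
The backward recursion (10) can be illustrated on a deliberately small problem. The sketch below is not the qubit problem itself: it assumes a finite state set, a finite control set and a user-supplied transition law, all of which are hypothetical stand-ins chosen so that the recursion can be followed by hand.

```python
def backward_dp(states, controls, step_cost, transition, terminal_cost, n_steps):
    """Backward dynamic programming for a finite state and control set.

    transition(x, u) returns a list of (probability, next_state) pairs.
    Returns (value, policy) with value[n][x] the optimal cost-to-go at
    step n and policy[n][x] a minimizing control.
    """
    value = [dict() for _ in range(n_steps + 1)]
    policy = [dict() for _ in range(n_steps)]
    for x in states:
        value[n_steps][x] = terminal_cost(x)  # boundary condition
    for n in range(n_steps - 1, -1, -1):      # work backwards in time
        for x in states:
            best_u, best_cost = None, float("inf")
            for u in controls:
                expected = sum(pr * value[n + 1][y] for pr, y in transition(x, u))
                cost = step_cost(u) + expected
                if cost < best_cost:
                    best_u, best_cost = u, cost
            value[n][x] = best_cost
            policy[n][x] = best_u
    return value, policy


def toy_example():
    """Toy chain 0 -> 1 -> 2 with target state 2 and quadratic control cost."""
    return backward_dp(
        states=[0, 1, 2],
        controls=[0, 1],
        step_cost=lambda u: 0.25 * u * u,
        transition=lambda x, u: [(1.0, min(x + u, 2))],
        terminal_cost=lambda x: 2 - x,
        n_steps=2,
    )
```

In the toy chain, starting from state 0 with two steps to go, moving at every step costs $0.25 + 0.25 = 0.5$, which is cheaper than paying the terminal penalty, so the recursion selects the control 1 at step 0.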

In the next section we derive a partial differential equation for the expected optimal cost-to-go $J^*$
by studying equation (10) with boundary condition $J^*(T, P^T) = 1 - P^T_z$ in infinitesimal form

$$
J^*(t, P^t) = \min_{u^+_t,\,u^-_t} \Big\{ \mathbb{E}_{P^t}\Big[\big((u^+_t)^2 + (u^-_t)^2\big)\,dt + J^*(t + dt, P^{t+dt})\Big] \Big\}. \tag{11}
$$

This is done by using the state dynamics for $P^{t+dt}$ and by subsequently expanding $J^*$ up to
the correct order according to Itô's formula. Solving the obtained partial differential equation is
equivalent to running the dynamic programming algorithm and therefore provides a solution to
the optimal control problem.



4 Bellman equations 

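In this section we first consider the case where we are measuring the field quadrature $Y_t = A_s^*(t) + A_s(t)$
by a homodyne detection scheme. The dynamics are given by equation (6). According to
Itô's formula we have

$$
dJ^*(t, P^t) = \partial_t J^*(t, P^t)\,dt + \sum_{\mu = x,y,z} \partial_\mu J^*(t, P^t)\,dP^t_\mu + \frac{1}{2} \sum_{\mu,\nu = x,y,z} \partial^2_{\mu\nu} J^*(t, P^t)\,dP^t_\mu\,dP^t_\nu, \tag{12}
$$

where $\partial_\mu$ denotes partial differentiation of $J^*(t, P^t)$ with respect to $P^t_\mu$ and $\partial^2_{\mu\nu}$ denotes partial
differentiation with respect to $P^t_\mu$ and $P^t_\nu$, and where we assume that $J^*$ is suitably differentiable.
Using the state dynamics (6), the differentials $dP^t_\mu$ can be expressed in terms of $dt$ and $dW_t$, and
products of differentials can be evaluated using Itô's rule $dW_t\,dW_t = dt$. Since the expectation
of the innovating martingale is zero, i.e. $\mathbb{E}_{P^t}[dW_t] = 0$, equation (11) simplifies a great deal by
substituting $J^*(t + dt, P^{t+dt}) = J^*(t, P^t) + dJ^*(t, P^t)$ and using (12) to obtain

$$
\begin{aligned}
-\partial_t J^* ={}& \min_{u^+_t,\,u^-_t}\Big\{ (u^+_t)^2 + (u^-_t)^2 - 2u^+_t P^t_z\,\partial_x J^* + 2u^-_t P^t_z\,\partial_y J^* + (2u^+_t P^t_x - 2u^-_t P^t_y)\,\partial_z J^* \Big\} \\
&+ \frac{\kappa_s^2}{2}\Big( \big(1 + P^t_z - (P^t_x)^2\big)^2\,\partial^2_{xx} J^* + (P^t_x P^t_y)^2\,\partial^2_{yy} J^* + (P^t_x)^2 (1 + P^t_z)^2\,\partial^2_{zz} J^* \Big) \\
&+ \kappa_s^2\Big( \big((P^t_x)^2 - 1 - P^t_z\big) P^t_x P^t_y\,\partial^2_{xy} J^* + \big((P^t_x)^2 - 1 - P^t_z\big) P^t_x (1 + P^t_z)\,\partial^2_{xz} J^* + (P^t_x)^2 P^t_y (1 + P^t_z)\,\partial^2_{yz} J^* \Big) \\
&- \Big( \tfrac{1}{2} P^t_x\,\partial_x J^* + \tfrac{1}{2} P^t_y\,\partial_y J^* + (1 + P^t_z)\,\partial_z J^* \Big),
\end{aligned}
\tag{13}
$$

with boundary condition $J^*(T, P^T) = 1 - P^T_z$. In control theory, equation (13) is known as the
Bellman equation and it was introduced into quantum feedback control theory in [6].
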
We evaluate the minimum in the Bellman equation by completing the squares on $u^+_t$ and $u^-_t$:

$$
\begin{aligned}
&(u^+_t)^2 + (u^-_t)^2 - 2u^+_t P^t_z\,\partial_x J^* + 2u^-_t P^t_z\,\partial_y J^* + (2u^+_t P^t_x - 2u^-_t P^t_y)\,\partial_z J^* \\
&\quad = \big(u^+_t + (P^t_x\,\partial_z J^* - P^t_z\,\partial_x J^*)\big)^2 - (P^t_x\,\partial_z J^* - P^t_z\,\partial_x J^*)^2 \\
&\qquad + \big(u^-_t + (P^t_z\,\partial_y J^* - P^t_y\,\partial_z J^*)\big)^2 - (P^t_z\,\partial_y J^* - P^t_y\,\partial_z J^*)^2 .
\end{aligned}
$$

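In this way we find an optimal control strategy in terms of the partial derivatives of the optimal
expected cost-to-go, given by

$$
u^+_t = P^t_z\,\partial_x J^* - P^t_x\,\partial_z J^*, \qquad u^-_t = P^t_y\,\partial_z J^* - P^t_z\,\partial_y J^*, \tag{14}
$$

where the optimal expected cost-to-go $J^*$ is the solution to the following second order non-linear
partial differential equation

$$
\begin{aligned}
-\partial_t J^* ={}& \frac{\kappa_s^2}{2}\Big( \big(1 + P^t_z - (P^t_x)^2\big)^2\,\partial^2_{xx} J^* + (P^t_x P^t_y)^2\,\partial^2_{yy} J^* + (P^t_x)^2 (1 + P^t_z)^2\,\partial^2_{zz} J^* \Big) \\
&+ \kappa_s^2\Big( \big((P^t_x)^2 - 1 - P^t_z\big) P^t_x P^t_y\,\partial^2_{xy} J^* + \big((P^t_x)^2 - 1 - P^t_z\big) P^t_x (1 + P^t_z)\,\partial^2_{xz} J^* + (P^t_x)^2 P^t_y (1 + P^t_z)\,\partial^2_{yz} J^* \Big) \\
&- \Big( \tfrac{1}{2} P^t_x\,\partial_x J^* + \tfrac{1}{2} P^t_y\,\partial_y J^* + (1 + P^t_z)\,\partial_z J^* \Big) \\
&- (P^t_x\,\partial_z J^* - P^t_z\,\partial_x J^*)^2 - (P^t_z\,\partial_y J^* - P^t_y\,\partial_z J^*)^2,
\end{aligned}
\tag{15}
$$

with boundary condition $J^*(T, P^T) = 1 - P^T_z$. This type of equation is called a Hamilton-Jacobi-Bellman
(HJB) equation. However, even finding a numerical solution to this equation is
a very hard problem, which is beyond the scope of this paper. In the following section we will look
at a system with much simpler dynamics for which we can actually solve the HJB equation.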
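
The completion of squares behind the optimal strategy (14) can be checked numerically at a single point. The sketch below treats the partial derivatives of $J^*$ as given numbers (the tuple `grad`, a hypothetical input, since $J^*$ itself is not known) and compares the control-dependent part of the Bellman equation at the candidate minimizer with its value at nearby controls.

```python
def control_cost(u_plus, u_minus, p, grad):
    """Control-dependent part of the Bellman equation for given gradient of J*."""
    px, py, pz = p
    dx, dy, dz = grad
    return (
        u_plus ** 2
        + u_minus ** 2
        - 2 * u_plus * pz * dx
        + 2 * u_minus * pz * dy
        + (2 * u_plus * px - 2 * u_minus * py) * dz
    )


def optimal_controls(p, grad):
    """Candidate minimizer (u_plus, u_minus) as given by equation (14)."""
    px, py, pz = p
    dx, dy, dz = grad
    return pz * dx - px * dz, py * dz - pz * dy
```

At the minimizer the cost equals minus the sum of the squared optimal controls, which are exactly the two negative quadratic terms that appear in the HJB equation (15).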

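In the remainder of this section we turn our attention to the situation where we count photons in
the side channel. We consider the same problem as before, i.e. we want to find optimal controls $u^+_t$
and $u^-_t$ depending on $P^t$ for each time $t$, such that the total expected cost $J(0, P^0)$ of equation (9)
is minimal. Since $N_t$ is a jump process, we use the Itô rule $dN_t\,dN_t = dN_t$, and the Itô formula
for calculating $dJ^*(t, P^t)$ changes accordingly. Using the dynamics (7) we find

$$
\begin{aligned}
dJ^*(t, P^t) ={}& \partial_t J^*(t, P^t)\,dt + \Big(\frac{\kappa_s^2}{2} P^t_x (1 + P^t_z) - \tfrac{1}{2} P^t_x - 2u^+_t P^t_z\Big)\,\partial_x J^*(t, P^t)\,dt \\
&+ \Big(\frac{\kappa_s^2}{2} P^t_y (1 + P^t_z) - \tfrac{1}{2} P^t_y + 2u^-_t P^t_z\Big)\,\partial_y J^*(t, P^t)\,dt \\
&+ \Big(\frac{\kappa_s^2}{2} (1 + P^t_z)^2 - (1 + P^t_z) + 2u^+_t P^t_x - 2u^-_t P^t_y\Big)\,\partial_z J^*(t, P^t)\,dt \\
&+ \big(J^*(t, P^t + Q^t) - J^*(t, P^t)\big)\,dN_t,
\end{aligned}
\tag{16}
$$

where $Q^t$ in the difference term is given by

$$
Q^t := \begin{pmatrix} -P^t_x \\ -P^t_y \\ -(1 + P^t_z) \end{pmatrix}, \quad\text{i.e.}\quad P^t + Q^t = \begin{pmatrix} 0 \\ 0 \\ -1 \end{pmatrix}.
$$
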
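Starting from equation (11), using $J^*(t + dt, P^{t+dt}) = J^*(t, P^t) + dJ^*(t, P^t)$, Itô's formula
(16) and the fact that $\mathbb{E}_{P^t}[dN_t] = \frac{\kappa_s^2}{2}(1 + P^t_z)\,dt$, we find the following Bellman equation for the
photon counting case (cf. [14])

$$
\begin{aligned}
-\partial_t J^* ={}& \min_{u^+_t,\,u^-_t}\Big\{ (u^+_t)^2 + (u^-_t)^2 - 2u^+_t P^t_z\,\partial_x J^* + 2u^-_t P^t_z\,\partial_y J^* + (2u^+_t P^t_x - 2u^-_t P^t_y)\,\partial_z J^* \Big\} \\
&+ \frac{\kappa_s^2}{2}(1 + P^t_z)\big(J^*(t, P^t + Q^t) - J^*(t, P^t)\big) + \Big(\frac{\kappa_s^2}{2} P^t_x (1 + P^t_z) - \tfrac{1}{2} P^t_x\Big)\,\partial_x J^* \\
&+ \Big(\frac{\kappa_s^2}{2} P^t_y (1 + P^t_z) - \tfrac{1}{2} P^t_y\Big)\,\partial_y J^* + \Big(\frac{\kappa_s^2}{2} (1 + P^t_z)^2 - (1 + P^t_z)\Big)\,\partial_z J^*,
\end{aligned}
$$

where the partial derivatives are all evaluated at $(t, P^t)$ and the boundary condition is $J^*(T, P^T) = 1 - P^T_z$.
Completing the squares leads again to an optimal control strategy given by equation (14),
where $J^*$ in the case of photon counting has to satisfy the following HJB equation

$$
\begin{aligned}
-\partial_t J^* ={}& \frac{\kappa_s^2}{2}(1 + P^t_z)\big(J^*(t, P^t + Q^t) - J^*(t, P^t)\big) + \Big(\frac{\kappa_s^2}{2} P^t_x (1 + P^t_z) - \tfrac{1}{2} P^t_x\Big)\,\partial_x J^* \\
&+ \Big(\frac{\kappa_s^2}{2} P^t_y (1 + P^t_z) - \tfrac{1}{2} P^t_y\Big)\,\partial_y J^* + \Big(\frac{\kappa_s^2}{2} (1 + P^t_z)^2 - (1 + P^t_z)\Big)\,\partial_z J^* \\
&- (P^t_x\,\partial_z J^* - P^t_z\,\partial_x J^*)^2 - (P^t_z\,\partial_y J^* - P^t_y\,\partial_z J^*)^2,
\end{aligned}
$$

with boundary condition $J^*(T, P^T) = 1 - P^T_z$. Solving this equation is again beyond the scope of
this paper.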



5 A simpler model 



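As we have discovered in the previous section, realistic optimal control problems usually lead to
very difficult Bellman equations. In this section we will study a drastically simpler model
with linear dynamics given by

$$
d\rho^t_\bullet = L(\rho^t_\bullet)\,dt - i\alpha[\sigma_z, \rho^t_\bullet]\,dW_t, \tag{18}
$$

with $\alpha$ a real constant and

$$
L(\rho) = -i[B_t\sigma_z, \rho] + \alpha^2\big(\sigma_z\rho\sigma_z - \tfrac{1}{2}\{\sigma_z^2, \rho\}\big), \quad\text{where}\quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
$$

In parameterized form equation (18) reads as

$$
dP^t = \begin{pmatrix} -2\alpha^2 P^t_x - 2B_t P^t_y \\ -2\alpha^2 P^t_y + 2B_t P^t_x \\ 0 \end{pmatrix} dt + \begin{pmatrix} -2\alpha P^t_y \\ 2\alpha P^t_x \\ 0 \end{pmatrix} dW_t. \tag{19}
$$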

The dynamics of equation (18) correspond to a two-level atom in a strongly driven, heavily
damped, optical cavity as in [20]. The cavity field is assumed to be far off resonance with
the atomic transition. The cavity is aligned along the z-axis and, instead of controlling with a
laser beam as in the previous sections, we now control the atom with an external magnetic field $B_t$
aligned along the z-axis. At the output of the cavity we measure the quadrature $Y_t = i(A^*(t) - A(t))$
by a homodyne detection scheme. Adiabatic elimination of the cavity dynamics then leads to
the dynamics of equation (18). The constant $\alpha$ is determined by properties of the cavity and the
probe beam [20].

From the dynamics (18) it follows that $dP^t_z = 0$, and furthermore we have

$$
d\big((P^t_x)^2 + (P^t_y)^2\big) = 2P^t_x\,dP^t_x + dP^t_x\,dP^t_x + 2P^t_y\,dP^t_y + dP^t_y\,dP^t_y = 0,
$$

i.e. our problem reduces to a problem on a circle. Let us re-parameterize by introducing $r$ and
$\theta_t$ such that $P^t_x = r\cos\theta_t$ and $P^t_y = r\sin\theta_t$ for $\theta_t \in [-\pi, \pi)$. Then the dynamics are given by
$dr = 0$ and

$$
d\theta_t = 2B_t\,dt + 2\alpha\,dW_t. \tag{20}
$$

Replacing $1 - P^T_z$ in the cost functions (8) and (9) by $\theta_T^2$ changes our goal to bringing the
system as close as possible to the $\sigma_x$-up state at time $t = T$. It leads to the following expected
cost-to-go function

$$
J(t, \theta_t) := \mathbb{E}_{\theta_t}\Big[\theta_T^2 + \int_t^T B_s^2\,ds\Big].
$$

The optimal control problem is now of linear quadratic type, i.e. the filtered dynamics are linear and
the cost function quadratic. Linear quadratic problems are well studied and are exactly solvable.

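Starting from (11), using Itô's formula, we find the following Bellman equation

$$
-\partial_t J^* = \min_{B_t}\big\{B_t^2 + 2B_t\,\partial_\theta J^*\big\} + 2\alpha^2\,\partial^2_{\theta\theta} J^*,
$$

with boundary condition $J^*(T, \theta_T) = \theta_T^2$. Completing the squares on $B_t$ leads to an optimal
control strategy

$$
B_t = -\partial_\theta J^*, \tag{21}
$$

where $J^*$ satisfies the following HJB equation

$$
-\partial_t J^* = -(\partial_\theta J^*)^2 + 2\alpha^2\,\partial^2_{\theta\theta} J^*, \tag{22}
$$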

with boundary condition $J^*(T, \theta_T) = \theta_T^2$. This equation is solved by making the Ansatz

$$
J^*(t, \theta_t) = \theta_t^2 f(t) + g(t),
$$

for some functions $f$ and $g$. Substituting this in (22) shows that we have to choose

$$
\dot{g} = -4\alpha^2 f,
$$

with boundary condition $g(T) = 0$. Furthermore $f$ has to satisfy the Riccati equation

$$
\dot{f} = 4f^2,
$$

with boundary condition $f(T) = 1$. Solving these equations leads to the following expression for
$J^*$:

$$
J^*(t, \theta_t) = \frac{\theta_t^2}{4(T - t) + 1} + \alpha^2 \log\big(4(T - t) + 1\big),
$$

which satisfies (22), as is easily checked. Equation (21) now easily leads to an optimal control
strategy given by

$$
B_t = -\frac{2\theta_t}{4(T - t) + 1}. \tag{23}
$$
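
The claim that the closed-form cost-to-go solves the HJB equation (22) can also be checked numerically. The sketch below evaluates $J^*(t,\theta) = \theta^2/(4(T-t)+1) + \alpha^2\log(4(T-t)+1)$ and estimates the residual of (22) with central finite differences; the step size `h` and the function names are assumptions of this example.

```python
import math


def j_star(t, theta, T, alpha):
    """Closed-form optimal cost-to-go of the linear quadratic problem."""
    tau = 4.0 * (T - t) + 1.0
    return theta ** 2 / tau + alpha ** 2 * math.log(tau)


def optimal_field(t, theta, T):
    """Optimal control field B_t of equation (23)."""
    return -2.0 * theta / (4.0 * (T - t) + 1.0)


def hjb_residual(t, theta, T, alpha, h=1e-4):
    """Finite-difference estimate of -dJ/dt + (dJ/dtheta)^2 - 2 alpha^2 d2J/dtheta2."""
    dt_j = (j_star(t + h, theta, T, alpha) - j_star(t - h, theta, T, alpha)) / (2 * h)
    dth_j = (j_star(t, theta + h, T, alpha) - j_star(t, theta - h, T, alpha)) / (2 * h)
    d2th_j = (
        j_star(t, theta + h, T, alpha)
        - 2 * j_star(t, theta, T, alpha)
        + j_star(t, theta - h, T, alpha)
    ) / (h * h)
    return -dt_j + dth_j ** 2 - 2 * alpha ** 2 * d2th_j
```

A residual close to zero at a few sample points, together with `j_star(T, theta, T, alpha)` returning `theta ** 2`, is consistent with the solution and its boundary condition; it is a sanity check, not a proof.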

Summarizing, at time $t$ we have found a measurement result $\omega$; integrating the dynamics (20) we
find the state $\theta_t(\omega)$, and from equation (23) we can determine the optimal control field $B_t(\omega)$ to
be applied at time $t$.
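
The feedback loop just described can be sketched as a small Monte Carlo experiment. The code below integrates (20) with an Euler-Maruyama scheme, applies the control field (23) at every step and averages the total cost over many sample paths, so that the result can be compared with $J^*(0, \theta_0) = \theta_0^2/(4T+1) + \alpha^2\log(4T+1)$. The discretization, the sample sizes and the fact that $\theta_t$ is not wrapped back into $[-\pi, \pi)$ are simplifying assumptions of this example.

```python
import math
import random


def simulate_cost(theta0, T, alpha, n_steps, rng, feedback=True):
    """Total cost of one sample path of (20), with or without feedback (23)."""
    dt = T / n_steps
    theta, cost = theta0, 0.0
    for k in range(n_steps):
        t = k * dt
        b = -2.0 * theta / (4.0 * (T - t) + 1.0) if feedback else 0.0
        cost += b * b * dt  # running cost of the control field
        theta += 2.0 * b * dt + 2.0 * alpha * rng.gauss(0.0, math.sqrt(dt))
    return cost + theta * theta  # add the terminal cost theta_T^2


def mean_cost(theta0, T, alpha, n_steps, n_paths, seed, feedback=True):
    """Monte Carlo estimate of the expected total cost."""
    rng = random.Random(seed)
    total = sum(
        simulate_cost(theta0, T, alpha, n_steps, rng, feedback)
        for _ in range(n_paths)
    )
    return total / n_paths
```

For $\theta_0 = 1$, $T = 1$ and $\alpha = 1/2$ the closed-form optimal cost is $1/5 + \tfrac{1}{4}\log 5 \approx 0.60$, while applying no control at all gives the expected cost $\theta_0^2 + 4\alpha^2 T = 2$; the estimates should reproduce this gap up to sampling and discretization error.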



6 Discussion 



In this paper we have studied the feedback control of a qubit in interaction with the electromagnetic
field. The Belavkin quantum filtering equation has been our starting point. Introducing the Bloch
vector $P^t$ as a sufficient statistic, as suggested in [5], we obtained generally non-linear equations
for the dynamics. In these equations the laser's phase and amplitude, represented by $u^+_t$ and $u^-_t$,
entered as the control parameters. The goal of the control was represented by a cost function $J$. We
proceeded by using the method of dynamic programming to find the optimal feedback control
strategy. In infinitesimal form the dynamic programming algorithm leads to the HJB equation for
the optimal cost-to-go function $J^*$. The optimal control strategy can be expressed in terms of the
solution to this equation.

Since the filtering equation in general provides non-linear dynamics, the resulting HJB equation is often
very difficult to solve and we have kept this outside the scope of this article. Linear dynamics are
obtained for systems in which the interaction with the environment is essentially commutative
[15]. This means the qubit couples only to one classical noise of the field. The linear dynamics
are obtained only when the observed process $Y_t$ is exactly this classical noise in the field. In the
last section of the article we have studied an example of a system for which the dynamics are
linear. Together with the quadratic cost function this led to an HJB equation that could be
solved exactly, providing an explicit expression for the optimal control strategy.